Below are the datasets I plan to use for my story.
https://data-msdis.opendata.arcgis.com/datasets/mo-public-school-districts/explore
https://data-msdis.opendata.arcgis.com/datasets/mo-2020-public-schools/explore
We always hear about the urban-rural divide, so I want to highlight differences between more populated counties and more rural ones, looking specifically at rural versus urban school districts.
I'd like to examine student/teacher ratios and school density by land area.
My target audience is Missouri residents who are interested in education and in how rural and urban areas differ. I hope this project offers insight into how rural and urban counties differ in educational facilities and resources. Readers should come away with a better understanding of the challenges districts may face with class size and student transportation.
## load packages
from pathlib import Path
import urllib.request
import shutil
import geopandas as gpd
import pandas as pd
import json
import folium
from folium.plugins import MarkerCluster
from branca.colormap import linear
pd.set_option('display.max_columns', None)
file_url = 'https://services2.arcgis.com/kNS2ppBA4rwAQQZy/ArcGIS/rest/services/MO_Public_School_Districts/FeatureServer/0?f=pjson'
local_file_name = 'MO_Public_School_Districts.json'
file_path = Path('../exercises/')
file_path /= local_file_name
# stream the response body straight into the local file
with urllib.request.urlopen(file_url) as response, file_path.open(mode='w+b') as out_file:
    shutil.copyfileobj(response, out_file)
##load the districts
mo_districts = gpd.read_file('MO_Public_School_Districts.shp', layer = 0)
##Import schools
mo_schools = gpd.read_file('MO_2020_Public_Schools.shp', layer = 0)
mo_districts.head()
| | FID | STATEFP | ELSDLEA | GEOID | NAME | DIST_NAME | DIST_CODE | CODIST | COUNTY | LSAD10 | LOGRADE | HIGRADE | MTFCC | SDTYP | FUNCSTAT | ALAND | AWATER | INTPTLAT | INTPTLON | Area_SqMil | Shape__Are | Shape__Len | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 29 | None | 2920490 | Maryville R-II School District | Maryville R-II | 074201 | 074-201 | Nodaway | 00 | PK | 12 | G5420 | None | E | 324365817 | 612177 | +40.3426688 | -094.8788717 | 125.453015 | 5.600208e+08 | 119623.574639 | POLYGON ((-94.92889 40.43306, -94.92791 40.433... |
| 1 | 2 | 29 | None | 2903480 | Atlanta C-3 School District | Atlanta C-3 | 061150 | 061-150 | Macon | 00 | KG | 12 | G5420 | None | E | 418773072 | 7996637 | +39.8997878 | -092.5182534 | 164.652172 | 7.260594e+08 | 204808.140357 | POLYGON ((-92.73476 40.00487, -92.73175 40.004... |
| 2 | 3 | 29 | None | 2918540 | Liberty 53 School District | Liberty 53 | 024090 | 024-090 | Clay | 00 | PK | 12 | G5420 | None | E | 212701169 | 1418667 | +39.2551406 | -094.3973310 | 82.635664 | 3.575781e+08 | 122716.191316 | POLYGON ((-94.49204 39.30984, -94.49204 39.310... |
| 3 | 4 | 29 | None | 2928430 | South Callaway Co. R-II School District | South Callaway Co. R-II | 014130 | 014-130 | Callaway | 00 | PK | 12 | G5420 | None | E | 498480796 | 9610382 | +38.7552288 | -091.8221408 | 196.068682 | 8.365335e+08 | 190280.448659 | POLYGON ((-91.64232 38.84370, -91.64235 38.843... |
| 4 | 5 | 29 | None | 2931440 | Waynesville R-VI School District | Waynesville R-VI | 085046 | 085-046 | Pulaski | 00 | PK | 12 | G5420 | None | E | 485531102 | 4910959 | +37.7655497 | -092.1553666 | 189.235301 | 7.860890e+08 | 187724.141675 | POLYGON ((-92.28531 37.90125, -92.28518 37.901... |
mo_schools.head()
| | FID | CtyDist | SchNum | SchID | Facility | Address | Address2 | City | State | ZIP | County | Phone | FAX | BGrade | EGrade | Principal | PrinTitle | Teachers | Enrollment | | Latitude | Longitude | Loc_Code | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 010093 | 3040 | 010093-3040 | Smithton Middle | 3600 W Worley | None | Columbia | MO | 652034679 | Boone | 5732143260 | 5732143261 | 06 | 08 | Mr. Chris Drury | Principal | 70 | 751 | cdrury@cpsk12.org | 38.959750 | -92.388944 | MAP_MU | POINT (-92.38895 38.95976) |
| 1 | 2 | 115115 | 4470 | 115115-4470 | Dewey School-Internat'L. Studies | 815 Ann Avenue | None | St. Louis | MO | 631044134 | St. Louis City | 3146454845 | 3146455926 | PK | 05 | Mr. Andrew Donovan | Principal | 31 | 433 | ANDREW.DONOVAN@SLPS.ORG | 38.630979 | -90.302469 | MAP_MU | POINT (-90.30240 38.63107) |
| 2 | 3 | 115115 | 4480 | 115115-4480 | Dunbar and Br. | 1415 N Garrison Avenue | None | St. Louis | MO | 631061506 | St. Louis City | 3145332526 | 3145330269 | PK | 06 | Mr. Anthony Virdure | Principal | 16 | 157 | ANTHONY.VIRDURE@SLPS.ORG | 38.645176 | -90.220644 | MAP_MU | POINT (-90.22058 38.64526) |
| 3 | 4 | 115906 | 1945 | 115906-1945 | Grand Center Arts Academy High | 711 N. Grand Avenue | None | St. Louis | MO | 631031029 | St. Louis City | 3145331791 | None | 09 | 12 | Ms. Ashley Olson | Head of School | 51 | 390 | ashley.olson@confluenceacademies.o* | 38.640595 | -90.230978 | MAP | POINT (-90.23080 38.64067) |
| 4 | 5 | 115916 | 6980 | 115916-6980 | Gateway Science Acad/St. Louis | 6576 Smiley Avenue | None | St. Louis | MO | 631392425 | St. Louis City | 3149327513 | 3149327514 | K | 05 | Mr. Nuh Celik | Principal | 34 | 424 | ncelik@gsastl.org | 38.606788 | -90.302452 | MAP | POINT (-90.30239 38.60687) |
mo_schools.describe()
| | FID | Teachers | Enrollment | Latitude | Longitude |
|---|---|---|---|---|---|
| count | 2392.000000 | 2392.000000 | 2392.000000 | 2392.000000 | 2392.000000 |
| mean | 1196.500000 | 36.155936 | 379.226589 | 38.406679 | -92.391508 |
| std | 690.655244 | 25.474640 | 336.203921 | 0.982586 | 1.718383 |
| min | 1.000000 | 0.000000 | 0.000000 | 36.042349 | -95.517960 |
| 25% | 598.750000 | 20.000000 | 157.000000 | 37.606746 | -94.142487 |
| 50% | 1196.500000 | 32.000000 | 319.500000 | 38.638314 | -92.544424 |
| 75% | 1794.250000 | 44.000000 | 485.250000 | 39.033227 | -90.534994 |
| max | 2392.000000 | 292.000000 | 2408.000000 | 40.551358 | -89.336075 |
mo_districts.describe()
| | FID | ALAND | AWATER | Area_SqMil | Shape__Are | Shape__Len |
|---|---|---|---|---|---|---|
| count | 556.000000 | 5.560000e+02 | 5.560000e+02 | 556.000000 | 5.560000e+02 | 556.000000 |
| mean | 278.500000 | 3.533803e+08 | 4.756741e+06 | 125.329984 | 5.294306e+08 | 135595.656717 |
| std | 160.647648 | 2.434082e+08 | 9.478520e+06 | 97.422157 | 4.131485e+08 | 69290.410992 |
| min | 1.000000 | 5.175745e+06 | 0.000000e+00 | 0.003682 | 1.580583e+04 | 535.917846 |
| 25% | 139.750000 | 1.802593e+08 | 5.919828e+05 | 59.426830 | 2.525088e+08 | 93420.363093 |
| 50% | 278.500000 | 2.957942e+08 | 1.708151e+06 | 104.898316 | 4.427092e+08 | 130482.861336 |
| 75% | 417.250000 | 4.629258e+08 | 4.955277e+06 | 166.577938 | 7.033748e+08 | 176140.129775 |
| max | 556.000000 | 1.308389e+09 | 1.117709e+08 | 507.046949 | 2.274117e+09 | 371187.544151 |
totalTeachers = mo_schools.groupby('CtyDist')['Teachers'].sum().to_frame().reset_index()
totalTeachers.columns = ['DIST_CODE', 'Teachers']
totalTeachers.head()
| | DIST_CODE | Teachers |
|---|---|---|
| 0 | 001090 | 31 |
| 1 | 001091 | 251 |
| 2 | 001092 | 31 |
| 3 | 002089 | 65 |
| 4 | 002090 | 21 |
totalStudents = mo_schools.groupby('CtyDist')['Enrollment'].sum().to_frame().reset_index()
totalStudents.columns = ['DIST_CODE', 'Students']
totalStudents.head()
| | DIST_CODE | Students |
|---|---|---|
| 0 | 001090 | 246 |
| 1 | 001091 | 2549 |
| 2 | 001092 | 148 |
| 3 | 002089 | 365 |
| 4 | 002090 | 187 |
##Count the number of schools per district and convert to a dataframe
SchoolsPerDist = mo_schools.groupby('CtyDist').size().to_frame().reset_index()
print(SchoolsPerDist.shape)
SchoolsPerDist.head()
(558, 2)
| | CtyDist | 0 |
|---|---|---|
| 0 | 001090 | 2 |
| 1 | 001091 | 5 |
| 2 | 001092 | 2 |
| 3 | 002089 | 3 |
| 4 | 002090 | 1 |
SchoolsPerDist.columns = ['DIST_CODE', 'SchoolsPerDist']
SchoolsPerDist.head()
| | DIST_CODE | SchoolsPerDist |
|---|---|---|
| 0 | 001090 | 2 |
| 1 | 001091 | 5 |
| 2 | 001092 | 2 |
| 3 | 002089 | 3 |
| 4 | 002090 | 1 |
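The three per-district tables above (teachers, students, school counts) could also be built in one pass with pandas named aggregation. The sketch below is only an illustration: `schools_toy` is a hypothetical stand-in for `mo_schools`, and its codes and numbers are invented.

```python
import pandas as pd

# Hypothetical stand-in for mo_schools; all values are invented for illustration.
schools_toy = pd.DataFrame({
    "CtyDist": ["000001", "000001", "000002", "000002", "000002"],
    "Teachers": [10, 21, 80, 90, 81],
    "Enrollment": [100, 146, 800, 900, 849],
})

# One groupby produces the school count and both sums at once.
per_district = (
    schools_toy.groupby("CtyDist")
    .agg(
        SchoolsPerDist=("Teachers", "size"),  # rows per group = schools per district
        Teachers=("Teachers", "sum"),
        Students=("Enrollment", "sum"),
    )
    .reset_index()
    .rename(columns={"CtyDist": "DIST_CODE"})
)
print(per_district)
```

A single table like this would need only one merge onto the district polygons instead of three.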
print(SchoolsPerDist.shape)
SchoolsPerDist.dtypes
(558, 2)
DIST_CODE         object
SchoolsPerDist     int64
dtype: object
print(mo_districts.shape)
mo_districts.dtypes
(556, 23)
FID              int64
STATEFP         object
ELSDLEA         object
GEOID           object
NAME            object
DIST_NAME       object
DIST_CODE       object
CODIST          object
COUNTY          object
LSAD10          object
LOGRADE         object
HIGRADE         object
MTFCC           object
SDTYP           object
FUNCSTAT        object
ALAND            int64
AWATER           int64
INTPTLAT        object
INTPTLON        object
Area_SqMil     float64
Shape__Are     float64
Shape__Len     float64
geometry      geometry
dtype: object
##Add schools, teachers, and students per district (left join on the shared DIST_CODE key)
districts = pd.merge(mo_districts, SchoolsPerDist, on='DIST_CODE', how='left')
districts = pd.merge(districts, totalTeachers, on='DIST_CODE', how='left')
districts = pd.merge(districts, totalStudents, on='DIST_CODE', how='left')
districts.head()
| | FID | STATEFP | ELSDLEA | GEOID | NAME | DIST_NAME | DIST_CODE | CODIST | COUNTY | LSAD10 | LOGRADE | HIGRADE | MTFCC | SDTYP | FUNCSTAT | ALAND | AWATER | INTPTLAT | INTPTLON | Area_SqMil | Shape__Are | Shape__Len | geometry | SchoolsPerDist | Teachers | Students |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 29 | None | 2920490 | Maryville R-II School District | Maryville R-II | 074201 | 074-201 | Nodaway | 00 | PK | 12 | G5420 | None | E | 324365817 | 612177 | +40.3426688 | -094.8788717 | 125.453015 | 5.600208e+08 | 119623.574639 | POLYGON ((-94.92889 40.43306, -94.92791 40.433... | 5 | 166 | 1465 |
| 1 | 2 | 29 | None | 2903480 | Atlanta C-3 School District | Atlanta C-3 | 061150 | 061-150 | Macon | 00 | KG | 12 | G5420 | None | E | 418773072 | 7996637 | +39.8997878 | -092.5182534 | 164.652172 | 7.260594e+08 | 204808.140357 | POLYGON ((-92.73476 40.00487, -92.73175 40.004... | 2 | 34 | 210 |
| 2 | 3 | 29 | None | 2918540 | Liberty 53 School District | Liberty 53 | 024090 | 024-090 | Clay | 00 | PK | 12 | G5420 | None | E | 212701169 | 1418667 | +39.2551406 | -094.3973310 | 82.635664 | 3.575781e+08 | 122716.191316 | POLYGON ((-94.49204 39.30984, -94.49204 39.310... | 20 | 1015 | 12815 |
| 3 | 4 | 29 | None | 2928430 | South Callaway Co. R-II School District | South Callaway Co. R-II | 014130 | 014-130 | Callaway | 00 | PK | 12 | G5420 | None | E | 498480796 | 9610382 | +38.7552288 | -091.8221408 | 196.068682 | 8.365335e+08 | 190280.448659 | POLYGON ((-91.64232 38.84370, -91.64235 38.843... | 4 | 90 | 781 |
| 4 | 5 | 29 | None | 2931440 | Waynesville R-VI School District | Waynesville R-VI | 085046 | 085-046 | Pulaski | 00 | PK | 12 | G5420 | None | E | 485531102 | 4910959 | +37.7655497 | -092.1553666 | 189.235301 | 7.860890e+08 | 187724.141675 | POLYGON ((-92.28531 37.90125, -92.28518 37.901... | 10 | 478 | 6163 |
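The grouped school tables have 558 rows while `mo_districts` has 556, so at least two `CtyDist` codes in the schools file have no matching district polygon, and a left merge drops them silently. One way to see which codes fail to match is an outer merge with `indicator=True`. This is a minimal sketch on invented data: `districts_toy` and `counts_toy` are hypothetical stand-ins, not the real tables.

```python
import pandas as pd

# Hypothetical stand-ins; the codes and counts are invented for illustration.
districts_toy = pd.DataFrame({"DIST_CODE": ["000001", "000002", "000003"]})
counts_toy = pd.DataFrame({
    "DIST_CODE": ["000001", "000002", "000009"],
    "SchoolsPerDist": [2, 5, 1],
})

# An outer merge keeps unmatched rows from both sides; the _merge column says
# where each row came from. validate raises if a code repeats on either side.
checked = pd.merge(
    districts_toy,
    counts_toy,
    on="DIST_CODE",
    how="outer",
    validate="one_to_one",
    indicator=True,
)

# Districts that have a polygon but no school counts
missing_counts = checked.loc[checked["_merge"] == "left_only", "DIST_CODE"].tolist()
# Codes in the schools table that match no district polygon
unmatched_codes = checked.loc[checked["_merge"] == "right_only", "DIST_CODE"].tolist()

print(missing_counts, unmatched_codes)
```

Running the same check on the real tables would show whether the unmatched codes matter for the story before they drop out of the map.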
##create a column to see students per teacher by district
districts['StudentsPerTeacherDistrictAVG'] = districts.Students/districts.Teachers
##Create a school/land area column
districts['SqMilesPerSchool'] = districts.Area_SqMil/districts.SchoolsPerDist
districts.sort_values('SqMilesPerSchool', ascending = False).head()
| | FID | STATEFP | ELSDLEA | GEOID | NAME | DIST_NAME | DIST_CODE | CODIST | COUNTY | LSAD10 | LOGRADE | HIGRADE | MTFCC | SDTYP | FUNCSTAT | ALAND | AWATER | INTPTLAT | INTPTLON | Area_SqMil | Shape__Are | Shape__Len | geometry | SchoolsPerDist | Teachers | Students | StudentsPerTeacherDistrictAVG | SqMilesPerSchool |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 31 | 32 | 29 | None | 2911280 | Knox Co. R-I School District | Knox Co. R-I | 052096 | 052-096 | Knox | 00 | PK | 12 | G5420 | None | E | 1286030139 | 7261263 | +40.1284564 | -092.1560353 | 499.008667 | 2.214785e+09 | 226287.750081 | POLYGON ((-92.12443 40.30356, -92.12328 40.303... | 2 | 52 | 451 | 8.673077 | 249.504334 |
| 451 | 452 | 29 | None | 2903060 | Alton R-IV School District | Alton R-IV | 075087 | 075-087 | Oregon | 00 | PK | 12 | G5420 | None | E | 1274431039 | 3173474 | +36.7420566 | -091.3960160 | 493.142998 | 1.993678e+09 | 254588.937796 | POLYGON ((-91.65821 36.87360, -91.65847 36.875... | 2 | 69 | 681 | 9.869565 | 246.571499 |
| 8 | 9 | 29 | None | 2920700 | Scotland Co. R-I School District | Scotland Co. R-I | 099082 | 099-082 | Scotland | 00 | PK | 12 | G5420 | None | E | 983501690 | 5981916 | +40.4641152 | -092.1660064 | 434.965998 | 1.948965e+09 | 206693.578301 | POLYGON ((-91.94312 40.60583, -91.94316 40.601... | 2 | 73 | 568 | 7.780822 | 217.482999 |
| 456 | 457 | 29 | None | 2918460 | Lewis Co. C-1 School District | Lewis Co. C-1 | 056017 | 056-017 | Lewis | 00 | PK | 12 | G5420 | None | E | 1059711015 | 8690561 | +40.0678744 | -091.7538065 | 412.298632 | 1.826208e+09 | 283022.552414 | POLYGON ((-91.95068 40.26203, -91.94900 40.262... | 2 | 90 | 879 | 9.766667 | 206.149316 |
| 195 | 196 | 29 | None | 2929810 | Summersville R-II School District | Summersville R-II | 107153 | 107-153 | Texas | 00 | PK | 12 | G5420 | None | E | 921850199 | 306241 | +37.2133105 | -091.6620205 | 355.886415 | 1.456554e+09 | 241855.613900 | POLYGON ((-91.64663 37.42274, -91.64651 37.422... | 2 | 43 | 436 | 10.139535 | 177.943208 |
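The `describe()` output for `mo_schools` shows a minimum of 0 for both `Teachers` and `Enrollment`, so a district total could be zero, and the left merge can leave districts with no totals at all. Dividing by zero teachers gives `inf`, which would distort sorting and the map's color scale. Below is a minimal sketch of one possible guard. It uses a hypothetical `toy` frame with a shortened column name, and only the first row's 451 students and 52 teachers come from the table above.

```python
import pandas as pd

# Hypothetical frame: row 0 reuses the Knox Co. R-I totals shown above,
# the other two rows are invented edge cases.
toy = pd.DataFrame({
    "DIST_CODE": ["000001", "000002", "000003"],
    "Students": [451.0, 120.0, None],
    "Teachers": [52.0, 0.0, None],
})

# Keep teacher counts only where they are positive; zeros and missing values
# become NaN, so the ratio is NaN rather than inf.
teachers = toy["Teachers"].where(toy["Teachers"] > 0)
toy["StudentsPerTeacher"] = toy["Students"] / teachers

print(toy)
```

Districts with a NaN ratio can then be dropped or flagged explicitly rather than appearing as extreme values.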
##This map shows which districts have more schools.
districtsMap = folium.Map([38.318364, -92.412253], tiles='CartoDB Positron', zoom_start=6.5)
# generate choropleth map, shading each district by its number of schools
choropleth = folium.Choropleth(
    geo_data=districts,
    data=districts,
    columns=['NAME', 'SchoolsPerDist'],
    key_on='feature.properties.NAME',
    fill_color='Reds',
    fill_opacity=1,
    line_opacity=1,
    legend_name='Schools per School District',
    highlight=True,
    smooth_factor=0,
).add_to(districtsMap)
# add tooltips showing each district's name and school count
tooltip_style = "font-size: 15px; font-weight: bold"
choropleth.geojson.add_child(
    folium.features.GeoJsonTooltip(['NAME', 'SchoolsPerDist'], style=tooltip_style, labels=False))
# create a layer control
folium.LayerControl().add_to(districtsMap)
<folium.map.LayerControl at 0x124f3b490>
districtsMap